Tree Oriented Data Analysis
Complex data objects arise in many areas of modern science including
evolutionary biology, neuroscience, dynamics of gene expression and medical
imaging. Object oriented data analysis (OODA) is the statistical analysis of
datasets of complex objects. Data analysis of tree data objects is an exciting
research area with interesting questions and challenging problems. This thesis
focuses on tree oriented statistical methodologies, and algorithms for solving
related mathematical optimization problems.
This research is motivated by the goal of analyzing a data set of images of
human brain arteries. The approach we take here is to use a novel
representation of brain artery systems as points in phylogenetic treespace. The
treespace property of unique global geodesics leads to a notion of geometric
center called a Fr\'echet mean. For a sample of data points, the Fr\'echet
function is the sum of squared distances from a point to the data points, and
the Fr\'echet mean is the minimizer of the Fr\'echet function.
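In symbols, the definition above can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the thesis: $\mathcal{T}$ denotes treespace, $d$ its geodesic distance, and $x_1, \dots, x_n$ the sample.

```latex
F(x) = \sum_{i=1}^{n} d(x, x_i)^2 ,
\qquad
\bar{x} = \operatorname*{arg\,min}_{x \in \mathcal{T}} F(x)
```

In Euclidean space with the usual distance, this minimizer is the ordinary arithmetic mean, which is why the Fr\'echet mean can be read as a generalization of the sample average.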
In this thesis we use properties of the Fr\'echet function to develop an
algorithmic system for computing Fr\'echet means. Properties of the Fr\'echet
function are also used to show a sticky law of large numbers which describes a
surprising stability of the topological tree structure of sample Fr\'echet
means at that of the population Fr\'echet mean. We also introduce
non-parametric regression, based on weighted Fr\'echet means, with brain artery
tree structure as the response variable and age as the predictor.
Comment: PhD thesis, University of North Carolina, 201
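A regression built on weighted Fr\'echet means could be written as below. This is only a sketch under stated assumptions: the weights $w_i(t)$, the kernel $K$ and the bandwidth $h$ are hypothetical choices in the style of kernel smoothing, and the thesis may construct its weights differently. Here $x_1, \dots, x_n$ are the sample trees, $t_i$ is the age paired with tree $x_i$, and $d$ is the geodesic distance in treespace $\mathcal{T}$.

```latex
\hat{m}(t) = \operatorname*{arg\,min}_{x \in \mathcal{T}} \sum_{i=1}^{n} w_i(t)\, d(x, x_i)^2 ,
\qquad
w_i(t) = \frac{K\big((t - t_i)/h\big)}{\sum_{j=1}^{n} K\big((t - t_j)/h\big)}
```

With equal weights $w_i(t) = 1/n$, the minimizer coincides with that of the unweighted sum of squared distances.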
Persistent homology analysis of brain artery trees
New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set. The correlation with age continues to be significant even after controlling for correlations from earlier significant summaries.
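As a rough illustration of how a persistence diagram can record branching, the sketch below computes 0-dimensional sublevel-set persistence of a small graph using union-find and the elder rule. The function name `zero_dim_persistence`, the toy star-shaped tree and its vertex values are hypothetical; the abstract does not specify the filtrations or software the authors used, so this should not be read as their method.

```python
def zero_dim_persistence(values, edges):
    """0-dimensional sublevel-set persistence of a graph.

    values[i] is the filtration value of vertex i (an assumed, illustrative
    choice such as height); edges is a list of (u, v) pairs. An edge enters
    the filtration as soon as both of its endpoints have entered.
    Returns a sorted list of (birth, death) pairs, with death = inf for
    components that never merge into an older one.
    """
    n = len(values)
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    parent = list(range(n))
    birth = list(values)  # birth[root] = smallest value in that component

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    added = [False] * n
    pairs = []
    for v in sorted(range(n), key=lambda i: values[i]):
        added[v] = True
        for u in adjacency[v]:
            if not added[u]:
                continue
            older, younger = find(u), find(v)
            if older == younger:
                continue
            if birth[older] > birth[younger]:
                older, younger = younger, older
            # Elder rule: the younger component dies at the current value.
            # Pairs with zero lifetime (birth == death) are not recorded.
            if birth[younger] < values[v]:
                pairs.append((birth[younger], values[v]))
            parent[younger] = older

    for root in {find(i) for i in range(n)}:
        pairs.append((birth[root], float("inf")))
    return sorted(pairs)


if __name__ == "__main__":
    # Toy tree: vertex 0 is a junction at value 3 joining three branch tips.
    tip_values = [3, 0, 1, 2]
    star_edges = [(0, 1), (0, 2), (0, 3)]
    print(zero_dim_persistence(tip_values, star_edges))
```

In this toy tree, three branches meet at one junction, so two of them produce finite (birth, death) pairs and the oldest one persists indefinitely; more branching yields more finite pairs.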
Sticky central limit theorems on open books
Given a probability distribution on an open book (a metric space obtained by
gluing a disjoint union of copies of a half-space along their boundary
hyperplanes), we define a precise concept of when the Fr\'{e}chet mean
(barycenter) is sticky. This nonclassical phenomenon is quantified by a law of
large numbers (LLN) stating that the empirical mean eventually almost surely
lies on the (codimension 1 and hence measure 0) spine that is the glued
hyperplane, and a central limit theorem (CLT) stating that the limiting
distribution is Gaussian and supported on the spine. We also state versions of
the LLN and CLT for the cases where the mean is nonsticky (i.e., not lying on
the spine) and partly sticky (i.e., lying on the spine but not sticky).
Comment: Published at http://dx.doi.org/10.1214/12-AAP899 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org
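The sticky behavior described in this abstract can be sketched numerically in the simplest case, assumed here for illustration: a "spider" of k half-lines glued at a single point, i.e. an open book whose spine has dimension 0. The function names, the point encoding `(leg, radius)` and the uniform sampling scheme are hypothetical choices, not taken from the paper. On leg j, the Fr\'echet function restricted to radius t is minimized at t = (2 S_j - S) / n, where S_j is the sum of the radii on leg j and S is the sum over all legs; if this is not positive for any leg, the mean sits at the origin.

```python
import random


def spider_distance(p, q):
    """Geodesic distance between two points on a spider.

    A point is (leg, radius) with radius >= 0. The origin is any point with
    radius 0; its leg label does not affect the distance.
    """
    (leg_p, r_p), (leg_q, r_q) = p, q
    if leg_p == leg_q:
        return abs(r_p - r_q)
    return r_p + r_q  # the path must pass through the origin


def frechet_function(x, points):
    """Sum of squared distances from x to the sample points."""
    return sum(spider_distance(x, p) ** 2 for p in points)


def frechet_mean_spider(points, k):
    """Frechet mean of a sample on a k-legged spider.

    Returns (leg, radius), or (None, 0.0) when the mean is the origin.
    """
    n = len(points)
    leg_sums = [0.0] * k
    for leg, radius in points:
        leg_sums[leg] += radius
    total = sum(leg_sums)
    for leg in range(k):
        # Unconstrained minimizer of the Frechet function along this leg.
        pull = (2.0 * leg_sums[leg] - total) / n
        if pull > 0:  # at most one leg can have a positive pull
            return leg, pull
    return None, 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    k = 3
    sample = [(rng.randrange(k), rng.random()) for _ in range(1000)]
    print(frechet_mean_spider(sample, k))

    # Push every point on leg 0 slightly outward and recompute the mean.
    shifted = [(leg, r + 0.05 if leg == 0 else r) for leg, r in sample]
    print(frechet_mean_spider(shifted, k))
```

Under this sampling scheme the expected pull along every leg is negative, so both calls should report the origin even though the second sample has been perturbed. That insensitivity to small perturbations is the kind of stickiness the abstract refers to; in a Euclidean space the same shift would move the mean.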
Evaluating genetic markers and neurobiochemical analytes for fluoxetine response using a panel of mouse inbred strains
Rationale: Identification of biomarkers that establish diagnosis or treatment response is critical to the advancement of research and management of patients with depression.
Objective: Our goal was to identify biomarkers that can potentially assess fluoxetine response and risk of poor treatment outcome.
Methods: We measured behavior, gene expression, and the levels of 36 neurobiochemical analytes across a panel of genetically diverse mouse inbred lines after chronic treatment with water or fluoxetine.
Results: Glyoxalase 1 (GLO1) and guanine nucleotide-binding protein 1 (GNB1) mostly account for baseline anxiety-like and depressive-like behavior, indicating a common biological link between depression and anxiety. Fluoxetine-induced biochemical alterations discriminated positive responders, while baseline neurobiochemical differences differentiated negative responders (p < 0.006). Results show that glial fibrillary acidic protein, S100 beta protein, GLO1, and histone deacetylase 5 contributed most to fluoxetine response. These proteins are linked within a cellular growth/proliferation pathway, suggesting the involvement of cellular genesis in fluoxetine response. Furthermore, a candidate genetic locus that associates with baseline depressive-like behavior contains a gene that encodes a cellular proliferation/adhesion molecule (Cadm1), supporting a genetic basis for the role of neuro/gliogenesis in depression.
Conclusion: We provided a comprehensive analysis of behavioral, neurobiochemical, and transcriptome data across 30 mouse inbred strains that has not been accomplished before. We identified biomarkers that influence fluoxetine response, which, altogether, implicate cellular genesis in fluoxetine treatment. More broadly, this approach can be used to assess a wide range of drug response phenotypes that are challenging to address in human samples.
Electronic supplementary material: The online version of this article (doi:10.1007/s00213-011-2574-z) contains supplementary material, which is available to authorized users.